Modeling Pronunciation Variation in Conversational Speech using Syntax and Discourse
Abstract
A significant source of variation in spontaneous speech is intra-speaker pronunciation change. Previous work in automatic speech recognition has identified several factors that affect pronunciation variability, such as phonetic context and speaking rate. This work examines new higher-level information sources, syntax and discourse structure, and specifically the relationship between these factors and pronunciation variation as seen in reduction and hyper-articulation. Analyses of hand-labeled data are used to determine features for phone-independent variables characterizing pronunciation changes, which in turn are used in a decision-tree-based dynamic pronunciation model. Pronunciation prediction experiments show a reduction in phone error rate of 10% over a baseline model using only phonetic context.
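To make the idea of a decision-tree pronunciation model concrete, here is a minimal sketch of a tree that predicts how a phone is realized from categorical features. It is not the authors' model: the feature names (`next_phone`, `word_class`, `info_status`), the output labels, and the six training rows are all invented for illustration, and the tree is grown with plain ID3-style information gain, a technique the abstract does not specify.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_feature(rows, labels, features):
    """Return the feature with the highest information gain."""
    base = entropy(labels)

    def gain(feature):
        remainder = 0.0
        for value in set(row[feature] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder

    return max(features, key=gain)


def build_tree(rows, labels, features):
    """Grow an ID3-style tree; leaves are labels, inner nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in set(row[feature] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[feature] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return {"feature": feature, "branches": branches, "default": majority}


def predict(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical toy data: realizations of one phone in different contexts.
# These rows are made up and do not reflect findings from the paper.
ROWS = [
    {"next_phone": "vowel", "word_class": "function", "info_status": "given"},
    {"next_phone": "vowel", "word_class": "content", "info_status": "given"},
    {"next_phone": "vowel", "word_class": "content", "info_status": "new"},
    {"next_phone": "consonant", "word_class": "function", "info_status": "given"},
    {"next_phone": "consonant", "word_class": "content", "info_status": "new"},
    {"next_phone": "consonant", "word_class": "function", "info_status": "new"},
]
LABELS = ["deleted", "flap", "full", "deleted", "full", "deleted"]

# Tree using phonetic context plus the assumed syntax/discourse features.
full_tree = build_tree(ROWS, LABELS, ["next_phone", "word_class", "info_status"])

query = {"next_phone": "vowel", "word_class": "content", "info_status": "given"}
print(predict(full_tree, query))
```

Passing only `["next_phone"]` as the feature list to `build_tree` gives a context-only tree on the same rows, which is one simple way to compare a phonetic-context baseline against a model that also sees higher-level features.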
Similar resources
Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition
Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show tha...
Flexible Parameter Tying for Conversational Speech Recognition
Modeling pronunciation variation is key for recognizing conversational speech. Previous efforts on pronunciation modeling by modifying dictionaries only yielded marginal improvement. Due to complex interaction between dictionaries and acoustic models, we believe a pronunciation modeling scheme is plausible only when closely coupled with the underlying acoustic model. This paper explores the use...
Modeling pronunciation variation using artificial neural networks for English spontaneous speech
Pronunciation variation in conversational speech has caused a significant number of word errors in large vocabulary automatic speech recognition. Rule-based approaches and decision-tree based approaches have been previously proposed to model pronunciation variation. In this paper, we report our work on modeling pronunciation variation using artificial neural networks (ANN). The results we achieve...
Pronunciation Modeling for Large Vocabulary Speech Recognition by Arthur
The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy for automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the pronunciations of words. Others model the pronunciati...
Rate-of-speech Modeling for Large Vocabulary Conversational Speech Recognition
Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To deal with these ROS effects, we propose to use parallel, rate-specific, acoustic models: one for fast speech, the other for slow speech. Rate switching is permitted at word boundaries, to allow modeling within-sentence speech rate variat...